Correlating Pupil Position to Gaze Location Within a Scene

ABSTRACT

Correlating pupil position to gaze location within a scene. At least some of the illustrative embodiments are methods including: receiving, by a first computer system, a first video stream depicting an eye of a user, the first video stream comprising a first plurality of frames; receiving, by the first computer system, a second video stream depicting a scene in front of the user, the second video stream comprising a second plurality of frames; determining, by the first computer system, pupil position within the first plurality of frames; calculating, by the first computer system, gaze location in the second plurality of frames based on pupil position within the first plurality of frames; and sending an indication of the gaze location to a second computer system, the second computer system distinct from the first computer system, and the sending in real-time with creation of the first video stream.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a conversion of U.S. Provisional Application Ser. No. 61/705,724 filed Sep. 26, 2012 titled “System and method of headset assisted cursor control”, which provisional application is incorporated by reference herein as if reproduced in full below.

BACKGROUND

Eye and/or gaze position tracking systems have many beneficial uses. For example, gaze position tracking systems may help disabled persons with cursor position control when using computer systems. Gaze position tracking may also find use in computer gaming, military applications, as well as assisting advertisers in gauging advertising placement effectiveness.

Many gaze position tracking systems require the user's head to be held steady. As a result, the uses of gaze position tracking systems may not be efficient in real-world situations, and/or may not provide a sufficient range of tracking situations.

BRIEF DESCRIPTION OF THE DRAWINGS

For a detailed description of exemplary embodiments, reference will now be made to the accompanying drawings in which:

FIG. 1 shows a system in accordance with at least some embodiments;

FIG. 2 shows a perspective view of a headset in accordance with at least some embodiments;

FIG. 3 shows a side elevation view of a headset in accordance with at least some embodiments;

FIG. 4 shows a front perspective view of a headset in accordance with at least some embodiments;

FIG. 5 shows an overhead view of a headset in accordance with at least some embodiments;

FIG. 6 shows a perspective view of a headset on a user in accordance with at least some embodiments;

FIG. 7 shows, in block diagram form, electronics of a headset system in accordance with at least some embodiments;

FIG. 8 shows example calibration features in accordance with at least some embodiments;

FIG. 9 shows a flow diagram of calibration steps in accordance with at least some embodiments;

FIG. 10 shows an example monitor located within a scene in accordance with at least some embodiments;

FIG. 11 shows an example monitor located within a scene in accordance with at least some embodiments;

FIG. 12 shows a flow diagram of an example method of correlating pupil location to gaze location in accordance with at least some embodiments;

FIG. 13 shows a flow diagram of moving a cursor based on gaze location in accordance with at least some embodiments.

NOTATION AND NOMENCLATURE

Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, different companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function.

In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . .” Also, the term “couple” or “couples” is intended to mean either an indirect or direct connection. Thus, if a first device couples to a second device, that connection may be through a direct connection or through an indirect connection via other devices and connections.

“Real-time” with respect to cursor movement responsive to pupil position movement shall mean the cursor movement takes place within two seconds or less of the pupil position movement.

“Light” shall mean electromagnetic radiation regardless of whether the light resides within the visible spectrum or outside the visible spectrum (e.g., infrared).

“Central axis” shall mean an imaginary straight line extending along the center of an object (e.g., an elongate body) or the center of a real or theoretical surface (e.g., viewing direction, light beam), but shall not require the object or surface to define rotational symmetry about the central axis.

“About” shall mean within five percent (5%) of the recited value.

DETAILED DESCRIPTION

The following discussion is directed to various embodiments of the invention. Although one or more of these embodiments may be preferred, the embodiments disclosed should not be interpreted, or otherwise used, as limiting the scope of the disclosure, including the claims. In addition, one skilled in the art will understand that the following description has broad application, and the discussion of any embodiment is meant only to be exemplary of that embodiment, and not intended to intimate that the scope of the disclosure, including the claims, is limited to that embodiment.

Various embodiments are directed to aspects of an eye tracking headset system. Other example embodiments are directed to an eye tracking headset with related cursor control (including related software). Various embodiments may also be directed to a novel calibration method and system that relates pupil position to gaze location in a scene (including gaze location on a computer monitor within the scene). The specification first turns to a high level overview.

High Level Overview

FIG. 1 shows a system in accordance with at least some embodiments. In particular, the system of FIG. 1 comprises a headset system 102 illustratively coupled to the user 104, and a computer system 106 communicatively coupled to the headset system 102. The headset system 102 comprises a headset 108 configured to couple to the head of the user 104, along with belt pack 110 configured to couple to the belt or waistband of the user 104. The belt pack 110 houses electronic devices (discussed more below) which perform various functions, such as pupil identification, pupil position tracking, gaze location determination, and calibration procedures. The headset 108 is illustratively shown to couple to the belt pack 110 by way of cable 112, but wireless communication between the headset 108 and belt pack 110 is also contemplated.

The example headset system 102 is communicatively coupled to the computer system 106, and as shown the communicative coupling may be by way of a wireless communication protocol or system (e.g., IEEE 802.11 wireless network, a Bluetooth network). In other situations, however, the belt pack 110, and thus the headset system 102, may communicatively couple to the computer system 106 by way of a wired or optical connection.

The headset system 102 has a plurality of cameras (not specifically shown in FIG. 1, but discussed more below) that facilitate pupil position tracking and gaze location determination. Using an eye camera that creates a video stream of the user's eye, the headset system 102 (and particularly programs executing on a processor in the belt pack 110) determines a series of pupil center positions. The headset system 102 then applies a homography that relates pupil center position in the video stream of the user's eye to the gaze location in a video stream produced by a forward looking scene camera. The headset system 102 then sends the gaze location information (along with other information) to the computer system 106.
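The per-frame flow just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the actual belt pack software: OpenCV (cv2) and NumPy are assumed to be available, the homography H is assumed to have been produced by the calibration procedure described later, and find_pupil_center() and send_gaze() are hypothetical helpers (a possible pupil-center helper is sketched in the Calibration section below).

    import cv2
    import numpy as np

    def track_gaze(eye_camera, H, send_gaze):
        """Per-frame loop: pupil center in the eye image -> gaze point in the scene image."""
        while True:
            ok, eye_frame = eye_camera.read()        # eye_camera: e.g., a cv2.VideoCapture
            if not ok:
                break
            gray = cv2.cvtColor(eye_frame, cv2.COLOR_BGR2GRAY)
            center = find_pupil_center(gray)         # hypothetical helper; see calibration sketch
            if center is None:
                continue                             # no pupil found in this frame
            pt = np.array([[center]], dtype=np.float32)
            gaze_x, gaze_y = cv2.perspectiveTransform(pt, H)[0, 0]
            send_gaze(float(gaze_x), float(gaze_y))  # forwarded to the remote computer system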

In the example case of cursor control, the computer system 106 moves or places the cursor 114 at a location on the display device 116 indicated by the gaze location. That is, the headset system 102 enables the user 104 to visually control the position of the cursor 114 on the display device 116. In the example case of using the headset system 102 to gauge advertising placement, the headset system 102 sends the gaze location information to the computer system 106 such that the computer system 106, or an operator of the computer system, may evaluate the objects within the scene that garnered the attention of the user, and for how long. The specification now turns to a more detailed description of the headset 108.

Headset

FIG. 2 shows a perspective, partial cut-away, view of a headset 108 in accordance with example embodiments. In particular, headset 108 comprises an adjustable head strap 200 defining a parabolic shape, as well as a first distal end 202 and a second distal end 204. Distal end 202 defines an aperture 206 within which a first ear piece member 208 (hereafter just ear piece 208) is disposed. In FIG. 2, the aperture 206 defines a “tear-drop” shape, though the aperture 206 may be any shape and size which enables coupling of the ear piece 208. For example, the aperture 206 may be circular, oval, or square. Similarly, the second distal end 204 defines an aperture 210 within which a second ear piece member 212 (hereafter just ear piece 212) is placed. Like aperture 206, the aperture 210 in the example system defines a “tear-drop” shape, though the aperture may be any suitable shape and size.

Thus, the example headset 108 comprises two ear pieces 208 and 212; however, in other cases the headset 108 may comprise only one ear piece (such as ear piece 208). In cases where only one ear piece is used, the distal end of the head strap 200 which does not couple to an ear piece may be configured to rest against the side of the head of user 104 to aid in holding the headset 108 in place.

In some example systems, the length of the head strap 200, as measured between the ear pieces 208 and 212 along the head strap 200, may be adjustable. In particular, in the example system outer portions of the head strap 200 may telescope into the center portion of the head strap 200. For example, outer portion 214 may telescope into and out of the center portion of the head strap 200, as shown by double-headed arrow 216. Likewise, outer portion 218 may telescope into and out of the center portion of the head strap 200, as shown by double-headed arrow 220. The adjustable length of the head strap 200 thus enables the user 104 to adjust the headset 108 so that the ear pieces 208 and 212 fit over the user's ears and the head strap 200 abuts the top of the user's head. Adjustment of the head strap 200 enables a single headset 108 to be usable for a plurality of users of different ages and/or head sizes.

Each of the ear pieces 208 and 212 may be of a circular shape. In the example systems, each ear piece is approximately three inches in diameter and one inch thick; however, the ear pieces may be any diameter, thickness, and/or design that enables operation of the methods as described herein. Each ear piece may comprise a padded portion on the inside surface (closest to the user's head) for wearing comfort and to reduce ambient noise. Thus, ear piece 208 has a padded portion 222 (only partially visible), and ear piece 212 has padded portion 224. Referring to padded portion 224 as representative of both padded portions, the padded portion 224 defines an internal aperture 226, within which the user's ear resides during use.

In the example system, each ear piece 208 and 212 comprises a speaker such that audible sounds may be conveyed to the ears of the user (e.g., sounds generated by programs executing on computer system 106). The speakers are not specifically shown in FIG. 2, but the speaker associated with ear piece 212 resides directly behind apertures 228. A similar arrangement regarding a speaker may exist for ear piece 208.

Still referring to FIG. 2, the headset 108 may further comprise an arm member 230 (hereafter just arm 230). The arm 230 is coupled to the head strap 200, and in example situations (and as shown) the arm couples to the head strap 200 by way of ear piece 208. While the example arm 230 couples to the “right” ear piece 208, the arm 230 may equivalently couple to the “left” ear piece 212 (with the arm 230 of appropriately mirrored construction). The arm 230 defines a proximal portion 232, a medial portion 234, and a distal portion 236. The proximal portion 232 couples to the example ear piece 208. Two cameras are disposed at the medial portion 234, an eye camera and a scene camera, though only the forward looking scene camera 238 is visible in FIG. 2. A reflective surface 240 is disposed at the distal portion 236. In the example system the reflective surface 240 couples to the arm 230 by way of a ball and socket arrangement 242 such that the orientation of the reflective surface 240 with respect to the eye of the user may be adjusted. As will be discussed more below, the eye camera is also “forward looking”, and the eye camera captures video of the eye of the user as a reflection in the reflective surface 240. In the example system shown, the ball portion of the ball and socket arrangement 242 is rigidly coupled to the arm 230, while the socket portion of the ball and socket arrangement 242 is rigidly coupled to the reflective surface 240, but the locations may be equivalently reversed.

FIG. 2 shows cable 112 extending from the ear piece 208 to the belt pack 110 (the belt pack 110 not shown in FIG. 2). In some example systems, the cable 112 is an HDMI cable, and thus the cable 112 may have an HDMI connector 244 that mates with a corresponding connector defined in the ear piece 208. In other cases, the cable 112 may be hardwired to the ear piece 208, and thus no connector need necessarily be present.

Finally with respect to FIG. 2, in some cases various electronic devices may be housed within an interior volume of one or both of the ear pieces 208 and/or 212. Ear piece 208 is shown in partial cut-away view to show that ear piece 208 defines an internal volume 246 within which various devices may be placed. In the example system, the interior volume 246 may house head movement measurement devices, such as a six-axis gyroscope 248. As will be discussed more below, the six-axis gyroscope is not needed for the gaze location determination in accordance with some example systems, but the six-axis gyroscope may enable useful features in some situations, such as computer gaming. In yet still other embodiments, the various devices within the belt pack 110 may be combined and/or miniaturized such that the various electronics can be housed in one of the ear pieces (or split between the ear pieces), thus eliminating the need for the belt pack.

Turning now to FIG. 3, FIG. 3 shows a side elevation view of the headset 108 in order to discuss physical relationships of various components of the headset 108. In particular, though the parabolic shape of the head strap 200 is not visible in FIG. 3, any consistent portion of the parabolic shape of the head strap 200 may be considered to reside in and thus define a plane. In the view of FIG. 3, the plane defined by the head strap 200 is perpendicular to the page, and thus the plane in the view of FIG. 3 would appear as a line. Using a center of the head strap 200 as the consistent feature, the plane defined by the headset is shown as plane 300 (shown as a dashed line).

At least the proximal portion 232 and medial portion 234 of arm 230 define an elongate body with a central axis 302. The central axis 302 of the arm 230 intersects plane 300. Thus, the arm 230 extends away from the plane 300 defined by the head strap 200. The length L of the arm 230 from the center 304 of the ear piece 208 (or, alternatively, from the intersection of the central axis 302 with the plane 300) to the outermost extent may be about six inches, although the length may be as short as five inches or as long as eight inches in other cases.

Still referring to FIG. 3, in various embodiments the angle formed between the plane 300 and the central axis 302 of the arm is adjustable, as shown by arrow 306. In some cases, the angle between the plane 300 and the central axis 302 of the arm 230 is adjustable in only one degree of freedom. Thus, for embodiments where the arm 230 is adjustable with respect to the plane 300 in only one degree of freedom, the arm 230 may adjust within the plane of the page of FIG. 3. The range of motion may be approximately 120 degrees, and the arm 230 may be moved upwards or downward a number of degrees in order to arrange the reflective surface 240 with respect to the location of the user's eye. For example, starting initially with the arm 230 in a horizontal orientation (arbitrarily assigned the 0 degree position), in example embodiments the arm 230 may be moved upwards (i.e., toward the head strap 200) as much as 60 degrees or downward (i.e., away from the head strap 200) as much as 60 degrees.

In accordance with some operational philosophies, the arm 230 may be adjusted such that the user's eye (in the example systems the right eye) looks through the reflective surface 240 toward distant objects. However, in other cases the arm 230 may be adjusted such that the reflective surface is outside the center of the user's line of sight (e.g., below the user's line of sight) when looking forward. In either event, the eye camera (again, not visible in FIG. 3) is designed, placed, and constructed to capture video of the eye as a reflection on or from the reflective surface 240.

As shown in FIG. 3, the viewing direction of scene camera 238 is away from the plane 300, and the viewing direction can be thought of as defining a central line 308 (i.e., the optical center of the viewing direction of the scene camera 238). If the central line 308 of the scene camera 238 is conceptually extended behind the scene camera 238 (as shown by dashed portion 310), in example embodiments the angle α between the central line 308 and the central axis 302 of the arm may be between 0 degrees (e.g., user looks through the reflective surface) and 30 degrees, and more particularly about 10 degrees. In the example system of FIG. 3, dashed line 310 intersects the center 304 of the ear piece 208, but such is not strictly required. In at least some embodiments, the angle α between the central line 308 of the scene camera 238 and the central axis 302 of the arm is not adjustable; rather, the angle α between the central line 308 and the central axis 302 is constant for a particular headset 108 design.

FIG. 4 shows a partial perspective view of the headset 108 to describe various components not shown in the previous views. In particular, FIG. 4 shows that ear piece 208 likewise defines an aperture 400 in the padded portion 222. As with the aperture 226 associated with ear piece 212 (both shown in FIG. 2), the aperture 400 in the padded portion 222 is the location within which the user's ear will reside during use of the headset 108. A speaker (not specifically shown) in the ear piece 208 is provided such that audible sounds may be conveyed to the ear of the user (e.g., sounds generated by a program executing on an attached computer system).

Also visible in FIG. 4 is the eye camera 402. The example eye camera 402 is disposed on the medial portion 234 of the arm 230, and more particularly the eye camera 402 is disposed on the inside surface 404 of the arm (i.e., on the portion closest to the user's face when the headset is being worn). The scene camera 238, by contrast, is medially disposed but on an upper surface 406 of the arm 230. The eye camera 402 has a viewing direction toward, and that includes, the reflective surface 240. As mentioned above, the eye camera 402 creates a video stream depicting the eye of the user as captured in a reflection on or from the reflective surface 240.

In situations where the user's line of sight looks through the reflective surface 240, the reflective surface may be transparent plastic or transparent glass. In situations where the reflective surface 240 is outside the center of the user's line of sight (e.g., below) when looking forward, the reflective surface 240 may be transparent plastic, transparent glass, or a glass mirror, and thus in some cases the reflective surface 240 may be referred to as a “hot mirror”. In one embodiment, and as shown, the reflective surface 240 is circular in shape and may have a diameter ranging from about 0.5 inches up to about 1.25 inches. In some cases, the reflective surface 240 may be at least 95% translucent, particularly in embodiments where the user looks through the reflective surface.

Based on operation of the ball and socket arrangement 242, the reflective surface 240 may be adjusted in order to position the reflective surface 240 in such a way that the eye camera 402 captures reflections of the user's eye from the reflective surface 240.

FIG. 5 shows an overhead view of a portion of the headset 108 in order to describe further structural relationships of the various components of the headset 108. In particular, visible in FIG. 5 is a portion of the head strap 200 and ear piece 208, including the padded portion 222. The arm 230 is visible, along with the scene camera 238, the eye camera 402, and the reflective surface 240. The reflective surface 240 may reside in and thus define a plane (and in the view of FIG. 5 the plane is perpendicular to the page). Moreover, the reflective surface 240 defines a central axis 500 perpendicular to the plane (i.e., the central axis perpendicular to the plane defined by the reflective surface 240). In accordance with example embodiments, the distance D between the central axis 302 of the arm and the central axis 500 of the reflective surface 240 is about 1.5 inches. In other embodiments, the distance D may range from about one (1) inch to about three (3) inches. That is, in the design of a headset 108 the distance D may be designed to fall within the range. In the embodiments shown in FIG. 5 the reflective surface 240 is not adjustable in the ranges provided, but in other cases the distance D may be adjustable.

FIG. 5 also shows that in some embodiments the headset 108 may comprise a light system 502. That is, in order to help capture a video stream of the user's eye, the eye may be illuminated by at least one light source, such as a light emitting diode (LED) 504. In some cases, the light system 502 may comprise two or more LEDs, but in the view of FIG. 5 only one LED 504 is visible. As shown, the light system 502 is disposed on the distal portion 236 of the arm 230, and more particularly between the central axis 302 of the arm 230 and the central axis 500 of the reflective surface 240. Each LED of the light system 502 has a beam direction that shines toward the user's eye (and thus toward the head strap 200, or toward the plane 300 defined by the head strap 200). The beam pattern created by each LED has a central axis, such as central axis 506 of the beam pattern for LED 504 (the beam pattern itself is not specifically delineated in FIG. 5).

In example embodiments, the light system 502 produces infrared light. In some cases, the infrared light created may have a center wavelength of about 850 nanometers, but other wavelengths of infrared light may also be used. In other cases, the light system may create visible light, but infrared light is less distracting to the user. In some cases, the light system 502 may create between about 50 and about 250 milliwatts of luminous power, with each LED carrying a respective portion of the overall luminous power output.

FIG. 6 shows a perspective view of the headset 108 as worn by user 104. In particular, visible in FIG. 6 is the head strap 200, ear piece 212, as well as the arm 230 and components associated therewith. Although the LEDs are not visible in the view of FIG. 6, FIG. 6 does illustrate the beam patterns associated with an example two LEDs. In particular, one LED of the light system 502 creates a first beam pattern 600 on the user's face, and likewise another LED of the light system 502 creates a second beam pattern 602 on the user's face. In most cases, the upper or first beam pattern 600 is created by an LED of the light system 502 that is at a higher elevation, and the lower or second beam pattern 602 is created by an LED of the light system 502 at a lower elevation, but the relationship of elevation of a beam pattern and elevation of a respective LED is not constrained to be so related.

Each beam pattern defines a central axis. Thus, one beam pattern defines central axis 506, while the example second beam pattern defines a central axis 604. In accordance with example embodiments, the angle λ between the central axes of the two example beam patterns may be about 45 degrees. In some cases, LEDs are selected and mounted in the arm 230 such that the beam patterns define sufficient spot size that the two beam patterns overlap on the user's face. Moreover, the headset 108 is preferably designed and constructed, and/or adjusted if needed, such that the overlapping areas of the beam patterns correspond to the location of the user's eye, as shown in FIG. 6. However, so long as the eye resides within one or both of the example beam patterns 600 and 602, there should be sufficient illumination for the eye camera to capture video of the eye (and the pupil) in the reflection on or from the reflective surface.

Headset Electronic Devices and Connections

FIG. 7 shows an electrical block diagram of various components of the headset system 102. In particular, FIG. 7 shows the headset 108 and belt pack 110 coupled by way of cable 112. Within the headset 108 resides a host of electrical and/or electronic components, some of which have been previously introduced. Headset 108 comprises the light system 502, scene camera 238, eye camera 402, right speaker 700, left speaker 702, and gyroscope 248. Data flows from headset 108 to the belt pack 110, and in the example embodiments the data are routed through and handled by the headset communication control board 704 (hereafter just control board 704). Communication from the belt pack 110 to the headset is also contemplated (e.g., camera control commands, audio for the speakers), and thus the control board 704 likewise handles receiving any such commands and data from the belt pack 110.

Right speaker 700 and left speaker 702 may be any suitable set of speakers for delivering audio to the user, such as circular speakers with eight or four ohm characteristic impedance.

Scene camera 238 in the example embodiments is a digital camera that produces a video stream at a frame rate of about 30 frames per second. As discussed above, the scene camera is arranged on the headset 108 to produce a video stream of the scene in front of the user. In some example systems, the scene camera 238 has 720 lines of vertical resolution and is progressive scan (i.e., is a 720p camera), but other resolutions and frame rates may be equivalently used. The scene camera in various embodiments is designed to capture and record light from the visible spectrum, and thus either the camera array is designed to be responsive only to light in the visible spectrum, or the camera array has optical filters that only allow light from the visible spectrum to reach the camera array. In accordance with the example embodiments, the scene camera sends the video stream to the belt pack 110 in the form of a stream of JPEG encoded files, one JPEG image per frame.

Eye camera 402 in the example embodiments is a digital camera that produces a video stream at a frame rate of about 60 frames per second. As discussed above, the eye camera 402 is arranged on the headset 108 to produce a video stream of the eye of the user, and as illustrated, the right eye of the user. Further in the embodiments shown, the eye camera 402 is arranged to capture the images to produce the video stream as the reflection of the eye in the hot mirror or reflective surface 240. In other embodiments, the eye camera 402 may be arranged to capture the images to produce the video stream directly (i.e., a direct capture, not as a reflection).

In the example system the eye camera 402 has a resolution of 640×480 pixels. Inasmuch as the light system 502 shines infrared light on the user's eye, the eye camera 402 in such embodiments is designed to capture infrared light, and thus either the camera array is designed to be responsive only to infrared light, or the camera array has optical filters that only allow infrared light to reach the camera array. In accordance with the example embodiments, the eye camera sends the video stream to the belt pack 110 in a raw image format in the form of RGB encoded information, sometimes referred to as Bayer RGB. In cases where the light system 502 illuminates the eye with visible light, or perhaps the headset system 102 is operated in an environment where sufficient visible ambient light exists, the eye camera 402 may instead capture and record visible light.

Light system 502 has been previously discussed to contain one or more LEDs, and in some cases two LEDs, that illuminate the eye with infrared light. In other cases, different types of light sources may be used (e.g., incandescent bulbs), and light other than infrared may also be used (e.g., visible light).

Optional gyroscope 248 resides within the headset (e.g., in the ear piece 208 as shown in FIG. 2) and communicatively couples to the belt pack 110 by way of the control board 704. Gyroscope 248 may be present to determine head movement of the user 104, but again it is noted that head movement measurements are not required to relate pupil center position in the video stream of the eye camera to gaze location in the video stream of the scene camera in at least some embodiments. In an example system, the headset 108 comprises a six-axis gyroscope (e.g., a combination accelerometer and gyroscope). In this way, head orientation and movement, including yaw, pitch and roll, may be ascertained. In other cases, head movement may be determined, with less accuracy, with a three-axis accelerometer. Likewise, while it is possible to determine head movement in a broad sense with a gyroscope alone, using a six-axis gyroscope very precise head movement changes can be detected and quantified.

Turning now to the belt pack 110, belt pack 110 may comprise an electronics enclosure 707 defining an interior volume within which the various components are housed. Within the enclosure 707 may reside the belt pack communication control board 706 (hereafter just control board 706), processor board 708, and battery 710. Battery 710 provides operational power for all the components of the headset system, including the components of the belt pack 110. By way of one or more conductors 712 in the cable 112, the electrical components of the headset 108 are provided power from the battery 710 as well. The battery 710 may take any suitable form, such as one or more single use batteries, or a rechargeable battery or set of batteries (e.g., lithium-ion batteries).

Control board 706 is a companion board to the control board 704 in the headset 108, and the control boards 704 and 706 work together to exchange data and commands between the headset 108 and the belt pack 110. In particular, by way of conductors 714 within the cable 112, the control board 704 may send to control board 706: frames of the video stream captured by the scene camera 238; frames of the video stream captured by the eye camera 402; and roll, pitch, and yaw measurements made by the gyroscope 248. Moreover, the control board 706 may send to the control board 704: control commands to the cameras 238 and 402; and audio for the right and left speakers 700 and 702. In example systems, the control boards may be communicatively coupled by way of cable 112 being an HDMI cable, and thus the control boards 704 and 706 may be coupled by conductors 714 in the cable 112, with the number of conductors being up to 19.

Processor board 708 is communicatively coupled to the control board 706. The processor board 708 comprises a processor 716 coupled to nonvolatile memory 718, volatile memory 720, and a wireless transceiver 722. Nonvolatile memory 718 may comprise any suitable nonvolatile memory system capable of holding programs and data for an extended period of time, such as battery backed random access memory (RAM) or read only memory (ROM) (e.g., flash memory). Volatile memory 720 may be the working memory for the processor 716. In many cases, programs and data stored on the nonvolatile memory 718 may be copied to the volatile memory 720 for execution and access by the processor 716. In some cases, the volatile memory 720 may be dynamic RAM (DRAM) or synchronous DRAM (SDRAM). One or both of the nonvolatile memory and the volatile memory may be a computer-readable medium upon which computer program instructions and data may be stored. Other examples of computer-readable media include CD-ROMs, DVDs, magnetic storage disks, and the like.

Still referring to FIG. 7, the example processor board 708 further comprises a wireless transceiver 722 coupled to the processor 716. The wireless transceiver 722 enables the processor board 708, and thus programs executing on the processor 716, to communicate wirelessly with other computer systems, such as computer system 106 (FIG. 1). In example systems, the wireless transceiver 722 implements a wireless communication protocol, such as one of the IEEE 802.11 family of wireless protocols. The wireless transceiver 722 may appear as a wireless router operating under the IEEE 802.11 protocol, and to which the computer system 106 may communicatively couple and exchange data with the programs executing on the processor 716. Other wireless communication protocols are also contemplated, such as a Bluetooth protocol connection system.

In the example system, the processor board 708 is an i.MX 6 series development board (also known as a Nitrogen6X board) available from Boundary Devices of Chandler, Ariz. Thus, the example processor 716 may be a 1 gigahertz ARM® core processor integrated with the i.MX 6 series development board running a Linux operating system. Likewise, the i.MX 6 series development board may come with nonvolatile memory 718, volatile memory 720, and the wireless transceiver 722. However, other implementations of the processor board 708 are contemplated, such as integrating the components from various manufacturers, or creation of an application specific integrated circuit to implement the various features.

Software Operation

The specification now turns to operational aspects of the headset system 102. Since the functionality is implemented, at least in part, by way of software executing on the processor 716, the description of the operational aspects is thus a description of the software operation. The discussion begins with calibration of the system.

Calibration

A first example step in using headset system 102 is a calibration procedure. That is, a procedure is performed from which a homography is created, where the homography relates pupil center position to gaze location in the scene. In this example procedure, the head position of the user need not be known or measured, and thus the presence of gyroscope 248 is not strictly required.

FIG. 8 shows an example set 800 of calibration points or calibration features. During a calibration procedure, calibration features are sequentially displayed in the scene in front of the user, and the user is asked to look at each of the calibration features as they are displayed. For example, calibration feature 802 may be initially displayed, and the user looks at the calibration feature 802 as it exists in the scene. After a predetermined period of time (e.g., five seconds), in the example calibration method the calibration feature 802 is removed from the scene, calibration feature 804 is displayed, and the user looks at calibration feature 804. The cycle is repeated for each calibration feature, and in the example shown in FIG. 8 nine such calibration features are ultimately shown.

In some cases, the calibration features may be displayed on the display device 116 of the computer system 106. That is, programs executing in the belt pack 110 may communicate with the computer system 106 and cause the computer system 106 to show the calibration features. However, calibration using calibration features shown on a display device of a computer system is not strictly required, even when the ultimate goal is cursor control. That is, in some cases the calibration features are not shown on a display device of a computer system. The calibration features may be shown in the scene in front of the user in any suitable form, such as on the display device of a computer system different than the one on which cursor control is desired, or by displaying a poster board in front of the user and sequentially revealing the calibration features.

Moreover, while the example calibration implements nine calibration features, fewer calibration features (e.g., two calibration features at opposing corners of the scene, or four calibration features at the four corners of the scene) may be used. Likewise, greater than nine calibration features may be used. Further still, sequentially revealing the calibration features may not be required. That is, all the calibration features may be simultaneously displayed, and the user sequentially looks at each calibration feature in a predetermined order (e.g., upper left to upper right, then middle left to middle right, etc.).
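For concreteness, the following Python sketch shows one way the nine calibration features could be presented sequentially on a display, assuming an OpenCV full-screen window; the 3×3 grid positions, dot size, and five-second dwell are illustrative values rather than requirements of the described system.

    import cv2
    import numpy as np

    def show_calibration_features(width=1920, height=1080, dwell_ms=5000):
        """Sequentially display nine dark calibration dots, one at a time, in a 3x3 grid."""
        cv2.namedWindow("calibration", cv2.WND_PROP_FULLSCREEN)
        cv2.setWindowProperty("calibration", cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
        fractions = (0.1, 0.5, 0.9)                  # relative positions of the 3x3 grid
        for fy in fractions:
            for fx in fractions:
                canvas = np.full((height, width, 3), 255, dtype=np.uint8)  # white background
                cv2.circle(canvas, (int(fx * width), int(fy * height)), 25, (0, 0, 0), -1)
                cv2.imshow("calibration", canvas)
                cv2.waitKey(dwell_ms)                # hold each feature for the dwell period
        cv2.destroyWindow("calibration")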

FIG. 9 shows, in block diagram form, an example calibration method, some of which may be implemented by programs executing on the processor 716. The example method starts by displaying calibration features in the scene in front of the user (block 900). As shown, the example method has two parallel paths, one associated with the scene camera (the left branch), and one associated with the eye camera (the right branch). The discussion starts with the branch associated with the eye camera.

In particular, during the period of time that the calibration features are being displayed, the eye camera 402 creates a video stream comprising a plurality of frames. In some cases, the eye camera 402 produces about 60 frames per second, and each frame is provided to the software executing on the processor 716 within the belt pack 110. For each frame, the program executing on the processor determines the pupil position, and more particularly the pupil center position, within the frame. Any suitable method to determine pupil center position may be used, such as those described in the co-pending and commonly assigned patent application Ser. No. 13/339,543 titled “System and method of moving a cursor based on changes in pupil position” (published as US Pub. No. 2013/0169532), which application is incorporated by reference herein as if reproduced in full below. Based on the pupil center position determination within each frame, the program creates a list of pupil center position coordinates (i.e., X,Y coordinates), and associated with each X,Y coordinate is a frame number (block 902). The frame number for each sequential frame is a number sequentially assigned by the eye camera 402. The frame number provides a time-like reference for each X,Y coordinate within the series of frame numbers, but may not provide an indication of actual local time. As will be discussed more below, the frame numbers may help logically relate the X,Y coordinates to the sequence of calibration features.
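The pupil-center method of the incorporated application is not reproduced here; as a stand-in, the sketch below assumes a simple dark-pupil approach (threshold the IR-illuminated eye image and take the centroid of the dark region) and shows how the (frame number, X, Y) list of block 902 might be accumulated.

    import cv2
    import numpy as np

    def find_pupil_center(gray_eye_frame, threshold=40):
        """Assumed stand-in for pupil detection: threshold the dark pupil, take its centroid."""
        _, mask = cv2.threshold(gray_eye_frame, threshold, 255, cv2.THRESH_BINARY_INV)
        m = cv2.moments(mask)
        if m["m00"] == 0:
            return None                              # no dark region found
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])

    def pupil_position_list(eye_frames):
        """Build the (frame_number, x, y) list of block 902 from numbered eye-camera frames."""
        positions = []
        for frame_number, frame in eye_frames:       # frame numbers as assigned by the eye camera
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            center = find_pupil_center(gray)
            if center is not None:
                positions.append((frame_number, center[0], center[1]))
        return positions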

Once all the calibration features have been shown, and the processor 716 has created the list of pupil center position coordinates (and frame numbers) (again block 902), the next step in the illustrative method is applying the data from the list to a clustering algorithm to identify clusters (block 904). That is, the data from the list of X,Y coordinates is clustered by the processor 716 based on the X,Y coordinates (e.g., distance-based clustering) to identify groups or clusters within the data. Any suitable clustering algorithm may be used to perform the clustering. For example, the clustering algorithm may be the K-means++ algorithm, where the K-means++ algorithm may take as input the list of coordinates and a target number of clusters (e.g., nine).

Though in the example method nine calibration features are shown, in most cases more than nine clusters will be identified in the data by the clustering algorithm. The additional clusters may be present for any number of reasons, such as inattention or distraction of the user during the calibration procedure, errors in pupil center position determination, and/or inefficiencies in the clustering algorithm itself. Regardless, the next step in the illustrative method may be to remove outlier clusters (e.g., remove based on the number of points in the cluster, assuming clusters created from the user looking at features other than calibration features will have fewer members) until the target number of clusters remains (block 906). Again, for calibration procedures using nine calibration features, nine clusters would be expected.
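One reading of blocks 904 and 906 is to over-cluster the X,Y samples and then keep only the most populated clusters; the sketch below assumes that reading and uses scikit-learn's K-means++ initialization, which is one of several suitable clustering implementations.

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_and_prune(positions, n_targets=9, n_initial=12):
        """Cluster (frame_number, x, y) samples on X,Y and keep the n_targets largest clusters."""
        xy = np.array([(x, y) for _, x, y in positions])
        km = KMeans(n_clusters=n_initial, init="k-means++", n_init=10).fit(xy)
        counts = np.bincount(km.labels_, minlength=n_initial)
        keep = np.argsort(counts)[-n_targets:]        # most-populated clusters survive (block 906)
        clusters = {}
        for label in keep:
            members = [positions[i] for i in np.where(km.labels_ == label)[0]]
            clusters[int(label)] = members
        return clusters                               # label -> list of (frame_number, x, y)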

The discussion now turns to the left branch of the example method, the branch associated with the scene camera. In particular, during the period of time that the calibration features are being displayed, the scene camera 238 creates a video stream comprising a plurality of frames. In some cases, the scene camera 238 produces about 30 frames per second, and each frame is provided to the software executing on the processor 716 within the belt pack 110. For each frame, the program executing on the processor determines the center position of the calibration feature within the frame. Any suitable method to determine the center position of the calibration features may be used. In the case of calibration features in the form of round, dark colored dots sequentially presented, the pupil center position determination algorithms may be used.

Based on the center position determination within each frame, the program creates a list of calibration feature center position coordinates (i.e., X,Y coordinates), and associated with each X,Y coordinate is a frame number (block 908). The frame number for each sequential frame is a number sequentially assigned by the scene camera 238. The frame number provides a time-like reference for each X,Y coordinate within the series of frame numbers, but may not provide an indication of actual local time. Moreover, there may be no numerical relationship between frame numbers as assigned by the scene camera 238 and frame numbers assigned by the eye camera 402 as discussed above. Again, as will be discussed more below, the frame numbers may help logically relate the X,Y coordinates to the sequence of calibration features.

Once all the calibration features have been shown, and the processor 716 has created the list of center position coordinates (and frame numbers) (again block 908), the next step in the illustrative method is applying the data from the list to a clustering algorithm to identify clusters (block 910). That is, the data from the list of X,Y coordinates is clustered by the processor 716 based on the X,Y coordinates to identify groups or clusters within the data. Any suitable clustering algorithm may be used to perform the clustering, such as the K-means++ algorithm mentioned above.

Though again in the example method nine calibration features are shown, in most cases more than nine clusters will be identified in the data by the clustering algorithm. The additional clusters may be present for any number of reasons, such as the background on which the calibration features are shown not spanning the entire scene captured by the scene camera 238, errors in center position determinations, and/or inefficiencies in the clustering algorithm. Regardless, the next step in the illustrative method may be to remove outlier clusters until the target number of clusters remains (block 912). For calibration procedures using nine calibration features, nine clusters would be expected in the data created from the scene camera frames.

Still referring to FIG. 9, the next step in the illustrative method is to correlate the calibration location clusters (i.e., clusters created from the frames of the scene camera) with the pupil center location clusters (i.e., clusters created from the frames of the eye camera) (block 914). In some example methods, the correlation of the clusters may be in a time-like manner. That is, the data created with respect to both the eye camera and the scene camera includes not only center position coordinates, but also frame numbers assigned by the respective cameras. While no absolute time information is necessarily included or embedded in the frame numbers, the frame numbers will imply a time order at least within the frame numbers themselves. That is, the frame numbers created and assigned by the eye camera imply an order of the frames, and likewise frame numbers created and assigned by the scene camera imply an order of the frames. Thus from the frame numbers an order of appearance of the calibration location clusters may be determined, and likewise an order of appearance of the pupil center location clusters may be determined. From the order of appearance, and the knowledge of the order of presentation of the calibration features, the relationship of the clusters within each data set may be determined.
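Because the two cameras assign independent frame numbers, the clusters can only be matched by their order of appearance. A minimal sketch of that time-like correlation, under the assumption that each cluster's mean frame number is a fair proxy for when it appeared, follows.

    import numpy as np

    def order_by_appearance(clusters):
        """Order clusters by mean frame number, the time-like reference described above."""
        return sorted(clusters.values(),
                      key=lambda members: np.mean([fn for fn, _, _ in members]))

    def correlate_clusters(pupil_clusters, scene_clusters):
        """Pair the i-th pupil cluster with the i-th scene cluster by order of appearance."""
        pairs = []
        for p_members, s_members in zip(order_by_appearance(pupil_clusters),
                                        order_by_appearance(scene_clusters)):
            p_xy = np.mean([(x, y) for _, x, y in p_members], axis=0)
            s_xy = np.mean([(x, y) for _, x, y in s_members], axis=0)
            pairs.append((tuple(p_xy), tuple(s_xy)))
        return pairs    # list of (pupil center, calibration feature center) correspondences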

From the relationship of the clusters, a homography is created which correlates pupil center position in the frames of the eye camera to gaze location in the frames of the scene camera (block 916). That is, data in the form of pupil position is applied to the homography, and the homography mathematically transforms the pupil center location to a pixel position in the frames of the scene camera—the gaze location of the user. Note that the example system can determine the gaze location of the user within the scene in front of the user regardless of where the user is looking. If the user is looking at a scene within the room (that does not include a computer monitor), the processor 716, determining pupil center positions and applying the homography, can determine where in the room the user is looking in real-time. Inasmuch as the example eye camera produces about 60 frames per second, and the example scene camera produces about 30 frames per second, each gaze location within a frame of the scene camera may be based on an average of two or more pupil center positions.
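As one possible implementation of block 916, the correspondences produced above can be passed to OpenCV's homography fitting; the function names and the use of findHomography/perspectiveTransform are an assumption for illustration, not a statement of how the described system is implemented.

    import cv2
    import numpy as np

    def build_gaze_homography(pairs):
        """Fit the homography mapping pupil-center coordinates to scene-camera coordinates."""
        src = np.array([p for p, _ in pairs], dtype=np.float32)   # pupil centers (eye camera)
        dst = np.array([s for _, s in pairs], dtype=np.float32)   # feature centers (scene camera)
        H, _ = cv2.findHomography(src, dst)                       # nine correspondences suffice
        return H

    def gaze_location(pupil_center, H):
        """Transform a single pupil center to a gaze location in scene-camera pixels."""
        pt = np.array([[pupil_center]], dtype=np.float32)
        return tuple(cv2.perspectiveTransform(pt, H)[0, 0])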

Gaze Direction Determination

A headset system 102 calibrated as described above may have many beneficial uses. For example, a user wearing a headset system 102 calibrated as above may be a test subject regarding placement of advertising. With the eye and scene cameras operational, and the processor 716 in the belt pack 110 determining gaze location within the scene in real-time, the user may be allowed to shop (e.g., in a grocery store). Viewing patterns and habits of the user 104 may be determined. For example, the processor 716 may make the gaze location determination, and send the gaze locations determined along with the video stream from the scene camera (the sending by the wireless transceiver 722) to a researcher in proximity. The researcher may then glean information about viewing patterns, and how to improve product placement. Other example uses are possible.

Cursor Position Control

In many cases, however, the gaze location determination of the headset system 102 is used in the context of a computer display device. Not only might the gaze location determinations be useful from a research standpoint (e.g., determining hot zones for advertising placement on display devices of computer systems), but the gaze location determinations may also find use in cursor position control, particularly for the disabled.

In an absolute sense, the headset system 102 can determine gaze location within a scene whether or not there is a computer display device in the scene. However, in order to use the gaze location determinations for cursor control on the display device of the computer system, additional information may be needed. In accordance with example embodiments, in order to relate gaze location within a scene that includes a display device to gaze location on the display device, programs executing on the processor 716 in the belt pack 110 also identify the display device within the scene. The specification now turns to example methods of identifying the display device within the scene in reference to FIG. 10.

FIG. 10 shows a frame 1000 of the video stream created by the scene camera. Within the frame 1000 resides a display device 1002. In accordance with example methods, the location/size of the display device 1002 may be identified within the scene by the processor 716. For example, the display device 1002 within the scene may show a border 1004 of predetermined color (e.g., blue). In one example embodiment, the border 1004 may be created within the active area of the display device. In another example embodiment, the border 1004 may be placed around the active area of the display device 1002 (e.g., blue tape placed on the case around the active area of the display device). The border 1004 (regardless of form) may be identified using machine vision software running on the processor 716 of the headset system 102. FIG. 10 illustratively shows a gaze location 1006 within the display device 1002.
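A border of predetermined color can be located with routine machine-vision steps; the sketch below assumes a blue border, OpenCV 4, and illustrative HSV threshold values that would need tuning in practice.

    import cv2
    import numpy as np

    def find_display_corners(scene_frame):
        """Return the four corners of a blue border found in a scene-camera frame, or None."""
        hsv = cv2.cvtColor(scene_frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, (100, 120, 60), (130, 255, 255))   # rough "blue" range (illustrative)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        border = max(contours, key=cv2.contourArea)                # assume the border is the largest blob
        approx = cv2.approxPolyDP(border, 0.02 * cv2.arcLength(border, True), True)
        if len(approx) != 4:
            return None                                            # did not resolve to a quadrilateral
        return approx.reshape(4, 2).astype(np.float32)             # four (x, y) corner points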

FIG. 11 shows a frame 1100 of the video stream created by the scene camera to depict an alternate method of identifying the display device within the scene. In particular, within the frame 1100 resides a display device 1102. In accordance with example methods, the display device 1102 may show a plurality of indicia of corner locations (labeled 1104A, 1104B, 1104C, and 1104D). In the example of FIG. 11, the indicia of corner locations are shown as “+” symbols, but any identifiable symbol may be used, and in fact each indicia of a corner location may be a different symbol. In one example embodiment, the indicia of corner locations 1104 may be created within the active area of the display device. In another example embodiment, the indicia of corner locations 1104 may be placed around the active area of the display device 1102 (e.g., decals placed on the case around the active area). The indicia of corner locations (regardless of form) may be identified using machine vision software running on the processor 716 of the headset system 102. FIG. 11 illustratively shows a gaze location 1106 within the display device 1102.
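Corner indicia such as the “+” symbols can likewise be found with ordinary template matching; the sketch below is one assumed approach (normalized cross-correlation against a small image of the marker), and a real implementation would also need to suppress duplicate matches and confirm that exactly four markers were found.

    import cv2
    import numpy as np

    def find_corner_indicia(scene_gray, marker_gray, threshold=0.8):
        """Return centers of template matches for a '+'-style corner marker."""
        result = cv2.matchTemplate(scene_gray, marker_gray, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(result >= threshold)
        h, w = marker_gray.shape
        return [(x + w / 2.0, y + h / 2.0) for x, y in zip(xs, ys)]   # match centers, duplicates included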

FIG. 12 shows a method in accordance with example embodiments implementing cursor control, where some of the example steps may be performed by software executing on the processor 716 of the headset system 102. In particular, the method starts by the processor receiving, from the eye camera, a video stream comprised of multiple frames, the video stream depicting the eye of the user (block 1200). Further, the processor receives, from the scene camera, a video stream comprised of multiple frames, the video stream depicting the scene in front of the user (block 1202). The processor determines pupil center position within each frame (block 1204), and calculates the gaze location within the scene based on the user's pupil center position (block 1206). That is, in block 1206 the processor 716 applies the homography determined in the calibration phase.

Next, the processor 716 identifies a display device visible in the scene (block 1208). Identifying the display device may take any suitable form, such as identifying the border 1004 of predetermined color (discussed with respect to FIG. 10), or identifying the indicia of corner locations (discussed with respect to FIG. 11). Once the location of the display device within the scene has been determined, the processor 716 of the headset system 102 sends an indication of the location of the display device within the scene to a second computer system (i.e., the computer system on which the cursor is to be controlled) (block 1210). For example, the processor 716 may send four pixel locations representing the four corners of the display device within the scene. Next, the processor 716 sends an indication of the resolution of the scene camera to the second computer system (block 1212), and sends an indication of the gaze location within the scene to the second computer system (block 1214). In most cases sending of the information to the second computer system is in real-time with creation and/or receiving of the frames from the eye camera and/or the scene camera. In most cases, the example method then begins anew with receiving frames of the video stream of the eye camera (again block 1200). The balance of the cursor control may be implemented on the computer system on which the cursor is to be controlled.
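The three pieces of information sent in blocks 1210, 1212, and 1214 could be carried over the wireless link in any convenient form; the JSON-over-socket framing below is purely an assumption for illustration.

    import json

    def send_gaze_update(sock, corners, scene_resolution, gaze):
        """Send display corners, scene-camera resolution, and gaze location (blocks 1210-1214)."""
        message = {
            "display_corners": [[float(x), float(y)] for x, y in corners],  # four scene-pixel corners
            "scene_resolution": list(scene_resolution),                     # e.g., [1280, 720]
            "gaze": [float(gaze[0]), float(gaze[1])],                       # gaze in scene pixels
        }
        sock.sendall((json.dumps(message) + "\n").encode("utf-8"))          # sock: a connected TCP socket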

FIG. 13 shows a method that may be implemented on a second computer system (i.e., the computer system on which the cursor is to be controlled), and which method may be implemented in software executing on the processor of the second computer system. In particular, the example method begins by receiving an indication of the location of the display device within a scene (block 1300). For example, the second computer system may receive the four pixel locations representing the four corners of the display device sent from processor 716. Next, the example method receives an indication of the resolution of the scene camera (block 1302), and receives an indication of the gaze location within the scene (block 1304). From the information provided from the headset system 102, the example method may then comprise transforming the gaze location in the scene to the coordinate system of the second computer system (i.e., the coordinate system of the display device) by performing a perspective transformation (block 1306). Finally, the second computer system then moves the cursor to the indicated gaze location on the display device (block 1308). In most cases, the example method then begins anew with receiving information from the headset system 102 (again blocks 1300, 1302, etc.). In example systems, the combination of the headset system 102 making gaze location determinations, and the second computer system receiving the gaze location data (and other information), may implement cursor movement control in real-time with the pupil movement of the user.
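On the second computer system, the perspective transformation of block 1306 maps the gaze point from scene-camera pixels into display pixels. The sketch below assumes the corners arrive in a consistent order (top-left, top-right, bottom-right, bottom-left) and uses OpenCV for the transform; actually moving the cursor is platform specific, and the pyautogui helper mentioned in the trailing comment is only one possibility, not part of the described system.

    import cv2
    import numpy as np

    def gaze_to_display(message, display_width, display_height):
        """Map a gaze point in scene-camera pixels to display coordinates (block 1306)."""
        src = np.array(message["display_corners"], dtype=np.float32)         # corners seen in the scene
        dst = np.array([[0, 0], [display_width, 0],
                        [display_width, display_height], [0, display_height]],
                       dtype=np.float32)                                     # corners of the display
        M = cv2.getPerspectiveTransform(src, dst)
        gaze = np.array([[message["gaze"]]], dtype=np.float32)
        x, y = cv2.perspectiveTransform(gaze, M)[0, 0]
        return int(x), int(y)

    # Cursor movement itself is platform specific; with a cross-platform helper it might be:
    #   import pyautogui
    #   pyautogui.moveTo(*gaze_to_display(message, 1920, 1080))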

Before proceeding, it is noted that the order of the various method steps in FIGS. 12 and 13 may be changed without adversely affecting operation of the headset system 102 and related second computer system (e.g., computer system 106). The various pieces of information used to perform the perspective transformation may be sent and/or received in any order, and thus FIGS. 12 and 13 shall not be read to imply a precise order of the steps.

Head Position Determination

The specification now turns to a discussion regarding calibration and measuring a user's head movement. In the example systems, in addition to eye movement (e.g., pupil center position tracking), head movement may also be tracked. In example systems the headset 108 may comprise a six-axis gyroscope 248 (e.g., a combination accelerometer and gyroscope). Thus, head orientation and movement, including yaw, pitch and roll, may be ascertained.

In example systems, a user undergoes the calibration of gaze location as discussed above, and a homography is created. For purposes of the discussion from this point forward in the specification, the homography related to gaze location will be referred to as the gaze location homography. With the gaze location homography created, the user is next asked to perform head movement while continuing to keep gaze at a calibration feature on the display device for a period of time (e.g., five seconds). In one embodiment, the calibration feature may be the last of the calibration features discussed above. During the time the user is gazing at the calibration feature and performing head movement, the system collects two-dimensional directional vector data indicating head movement (e.g., collects from the six-axis gyroscope 248).

After the head movement data is collected, the processor 716 calculates a second homography, termed herein a head homography, which relates head position to pupil center position. The head homography is created from two sets of values comprising five two-dimensional head directions and the corresponding two-dimensional pupil center positions in the image from eye camera 402.

The five two-dimensional head directions are:

X Y  (1)

+X +Y  (2)

+X −Y  (3)

−X +Y  (4)

−X −Y  (5)

Where X Y (1) represents the initial head position of the user; −X is a minimum value for X; +X is a maximum value for X; −Y is a minimum value for Y; and +Y is a maximum value for Y.
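Fitting the head homography can be done the same way as the gaze location homography; the sketch below assumes the five (X, Y) head directions above and the five matching pupil centers have already been paired up, and uses OpenCV's findHomography as one possible fitting routine.

    import cv2
    import numpy as np

    def build_head_homography(head_directions, pupil_centers):
        """Fit the head homography relating 2-D head direction to pupil center position."""
        src = np.array(head_directions, dtype=np.float32)    # the five (X, Y) head directions (1)-(5)
        dst = np.array(pupil_centers, dtype=np.float32)      # matching pupil centers from eye camera 402
        H_head, _ = cv2.findHomography(src, dst)              # five correspondences are enough for a fit
        return H_head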

The example systems implementing a gyroscope 248 may find use with computer programs that implement a first-person camera view on the display device, such as in a first-person shooter game. In particular, the processor 716 may periodically (e.g., once each frame of the eye camera) measure head movement by recording data from the six-axis gyroscope 248. The measurement consists of three angles: yaw, pitch, and roll. The measurement is first subtracted from the value of an earlier measured three-dimensional head direction value that was recorded during calibration. The difference between the earlier measured three-dimensional head direction value and the last recorded measurement (i.e., the previously recorded frame) is calculated. If there is no last recorded measurement, then the current frame calculation is the measured value.

The processor 716 may then send a value or set of values to the second computer system representing the change or difference in head position. Based on the head position (or change in head position) determined, the second computer system implementing the first-person camera may move the first-person camera view on the display device based on the difference between the earlier measured three-dimensional head direction value and the last recorded measurement. Stated more directly, when using the described system with computer programs showing first-person camera views on a display device, head movement by the user alone (and independent of eye movement) causes changes in the camera view on the display device. Other modes of first-person camera control may augment the head movement. For example, in a first-person shooter game environment, a player may wish to turn 180 degrees in the game without turning around in real life. Making a mouse movement, in addition to moving his eyes and head, may help accomplish this. In another embodiment, a supplementary instruction may be mapped to a key on an input device, such as a keyboard.
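On the second computer system, applying the received change to a first-person camera view could be as simple as the following sketch; the sensitivity scaling and pitch clamping are illustrative assumptions and are not part of the description above.

    # Illustrative sketch: apply a received head-movement delta to a
    # first-person camera view.
    def apply_head_delta(camera_yaw, camera_pitch, delta_yaw, delta_pitch,
                         sensitivity=1.0):
        camera_yaw += sensitivity * delta_yaw
        # Clamp pitch so the view cannot flip past vertical
        camera_pitch = max(-89.0, min(89.0,
                                      camera_pitch + sensitivity * delta_pitch))
        return camera_yaw, camera_pitch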

Eye Movement Mapping to In-Game Aim

As discussed above, the combination of the headset system 102 and the example computer system 106 can determine an absolute gaze location on the display device (independent of head position). The absolute gaze location on the display device may be translated into an aiming point for an in-game weapon, and if a player then shoots the weapon, it will “shoot” toward the user's absolute gaze location on the display device. In other words, the center point of an aim need not be the center point of the in-game camera view.
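A sketch of this decoupling is given below: the absolute gaze location in display pixels is converted to normalized device coordinates through which a game engine could cast the aim ray, independent of the camera center. The function name and coordinate convention are illustrative assumptions, not part of the description above.

    # Illustrative sketch: gaze location on the display as an in-game
    # aiming point, independent of the camera center.
    def gaze_to_aim_point(gaze_px, display_w, display_h):
        # gaze_px: (x, y) absolute gaze location in display pixels
        # Returns normalized device coordinates in [-1, 1], where (0, 0)
        # is the screen center; the aim ray is cast through this point.
        ndc_x = 2.0 * gaze_px[0] / display_w - 1.0
        ndc_y = 1.0 - 2.0 * gaze_px[1] / display_h  # flip Y: pixel rows grow downward
        return ndc_x, ndc_y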

In the specification and claims, certain components may be described in terms of algorithms and/or steps performed by software that may be provided on a non-transitory storage medium (i.e., other than a carrier wave or a signal propagating along a conductor). The various embodiments also relate to a system for performing various steps and operations as described herein. This system may be a specially-constructed device such as an electronic device, or it may include one or more general-purpose computers that can follow software instructions to perform the steps described herein. Multiple computers can be networked to perform such functions. Software instructions may be stored in any computer readable storage medium, such as, for example, magnetic or optical disks, cards, memory, and the like.

From the description provided herein, those skilled in the art are readily able to combine software created as described with appropriate general-purpose or special-purpose computer hardware to create a computer system and/or computer sub-components in accordance with the various embodiments, to create a computer system and/or computer sub-components for carrying out the methods of the various embodiments, and/or to create a non-transitory computer-readable medium (i.e., not a carrier wave) that stores a software program to implement the method aspects of the various embodiments.

References to “one embodiment,” “an embodiment,” “some embodiments,” “various embodiments,” or the like indicate that a particular element or characteristic is included in at least one embodiment of the invention. Although the phrases may appear in various places, the phrases do not necessarily refer to the same embodiment.

The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

What is claimed is:
 1. A method comprising: receiving, by a first computer system, a first video stream depicting an eye of a user, the first video stream comprising a first plurality of frames; receiving, by the first computer system, a second video stream depicting a scene in front of the user, the second video stream comprising a second plurality of frames; determining, by the computer system, pupil position within the first plurality of frames; calculating, by the first computer system, gaze location in the second plurality of frames based on pupil position within the first plurality of frames; and sending an indication of the gaze location to a second computer system, the second computer system distinct from the first computer system, and the sending in real-time with creation of the first video stream.
 2. The method of claim 1 further comprising, prior to calculating and sending: calibrating a relationship between pupil position within the first plurality of frames and gaze location in the second plurality of frames, the calibrating by: displaying a plurality of calibration features in the scene in front of the user; determining, by the first computer system, location of the calibration features within the second plurality of frames; relating, by the first computer system, pupil position in the first plurality of frames to location of the calibration features in the second plurality of frames; and thereby creating, by the first computer system, a homography that relates pupil position in the first plurality of frames to gaze location in the second plurality of frames.
 3. The method of claim 2 wherein displaying the calibration features further comprises revealing the plurality of calibration features sequentially, the revealing of the plurality of calibration features not on a computer monitor.
 4. The method of claim 2 wherein displaying the calibration features further comprises sequentially revealing the plurality of calibration features on a display device of a computer system.
 5. The method of claim 2 wherein relating further comprises: clustering, by the first computer system, indications of pupil position derived from the first plurality of frames, the clustering creates a plurality of pupil clusters; clustering, by the second computer system, indications of location of the calibration features in the second plurality of frames, the clustering indications of location creates a plurality of feature clusters; and correlating the pupil clusters to the feature clusters.

 6. The method of claim 5 wherein, prior to correlating, discarding at least one cluster from the plurality of pupil clusters.
 7. The method of claim 5 wherein correlating further comprises associating appearance of calibration features in the scene with pupil clusters based on frame numbers associated with each pupil position determination.
 8. The method of claim 1 further comprising: identifying, by the first computer system, a computer monitor visible in the second plurality of frames; sending an indication of the location of the computer monitor within the second video stream, the sending to the second computer system; and sending an indication of a resolution of the second video stream, the sending to the second computer system.
 9. The method of claim 8 wherein identifying further comprises at least one selected from the group consisting of: identifying a border of predetermined color shown on the computer monitor; identifying four indicia of corner locations shown on the computer monitor; and identifying a border of predetermined color that at least partially circumscribes the computer monitor.
 10. The method of claim 8 further comprising: receiving, by the second computer system, the indication of gaze location, the indication of location of the display device within the second video stream, and the indication of resolution of the second video stream; performing, by the second computer system, a perspective transformation resulting in a desired location on the display device; and moving, by the second computer system, a cursor shown on the computer monitor to the desired location.

 11. A computer system comprising: a processor; a memory coupled to the processor; and a wireless communication subsystem coupled to the processor; wherein the memory stores a program that, when executed by the processor, causes the processor to: receive a first video stream depicting an eye of a user, the first video stream comprising a first plurality of frames; receive a second video stream depicting a scene in front of the user, the second video stream comprising a second plurality of frames; determine pupil position within the first plurality of frames; calculate gaze location in the second plurality of frames based on pupil position within the first plurality of frames; and send, by way of the wireless communication subsystem, an indication of the gaze location to a remote computer system, and the send in real-time with receipt of the first video stream.
 12. The computer system of claim 11 wherein the program further causes the processor to: calibrate a relationship between pupil position within the first plurality of frames and gaze location in the second plurality of frames, the calibration by causing the processor to: determine location of a plurality of calibration features within the second plurality of frames; relate pupil position in the first plurality of frames to location of the calibration features in the second plurality of frames; and thereby create a homography that relates pupil position in the first plurality of frames to gaze location in the second plurality of frames.
 13. The computer system of claim 12 wherein, as part of calibration, the program further causes the processor to cause a display device of the remote computer system to display the plurality of calibration features.
 14. The computer system of claim 12 wherein when the processor relates pupil position to the location of the calibration features, the program causes the processor to: cluster indications of pupil position derived from the first plurality of frames, the clustering creates a plurality of pupil clusters; cluster indications of location of the calibration features in the second plurality of frames to create a plurality of feature clusters; and correlate the pupil clusters to the feature clusters.

 15. The computer system of claim 14 wherein the program further causes the processor, prior to the correlation, to discard at least one cluster from the plurality of pupil clusters.
 16. The computer system of claim 14 wherein when the processor correlates, the program causes the processor to associate appearance of calibration features in the scene with pupil clusters based on frame numbers associated with each pupil position determination.
 17. The computer system of claim 11 wherein the program further causes the processor to: identify a display device visible in the second plurality of frames; send an indication of the location of the display device within the second video stream, the sending to the second computer system; and send an indication of a resolution of the second video stream, the sending to the second computer system.
 18. The computer system of claim 17 wherein when the processor identifies, the program further causes the processor to at least one selected from the group consisting of: identify a border of predetermined color shown on the display device; identify four indicia of corner locations shown on the display device; and identify a border of predetermined color that at least partially circumscribes the display device.
 19. A non-transitory computer readable medium storing a program that, when executed by a processor, causes the processor to: receive an indication of gaze location of a user, the gaze location within a scene, and the scene including a display device associated with the processor; receive an indication of resolution of the scene; receive an indication of location of the display device within the scene; perform a perspective transformation resulting in a desired location within the display device; and move a cursor shown on the computer monitor to the desired location.
 20. The non-transitory computer readable medium of claim 19 wherein when the processor receives the indication of gaze location, the program causes the processor to receive a pixel location within the scene.
 21. The non-transitory computer readable medium of claim 19 wherein when the processor receives the indication of resolution of the scene, the program causes the processor to receive an indication of pixel width and pixel height of the scene.
 22. The non-transitory computer readable medium of claim 19 wherein when the processor receives the indication of location of the display device within the scene, the program causes the processor to receive four pixel locations within the scene corresponding to four corners of the display device within the scene.
 23. The non-transitory computer readable medium of claim 19 wherein the program further causes the processor to: receive an indication from a headset system to begin a calibration procedure; and responsive to the indication from the headset, display a plurality of calibration features on the display device.
 24. The non-transitory computer readable medium of claim 23 wherein when the processor displays, the program causes the processor to sequentially display nine calibration features at predetermined locations on the display device.